Summary 

Our aim is to detect mechanistic interaction between the effects of two causal factors 
on a binary response, as an aid to identifying situations where the effects are mediated 
by a common mechanism. We propose a formalization of mechanistic interaction 
which acknowledges asymmetries of the kind "factor A interferes with factor B, but 
not vice versa". A class of tests for mechanistic interaction is proposed, which works 
on discrete or continuous causal variables, in any combination. Conditions under 
which these tests can be applied under a generic regime of data collection, be it 
interventional or observational, are discussed in terms of conditional independence 
assumptions within the framework of Augmented Directed Graphs. The scientific 
relevance of the method and the practicality of the graphical framework are illustrated 
with the aid of two studies in coronary artery disease. Our analysis relies on the "deep 
determinism" assumption that there exists some relevant set V of "context variables", 
possibly unobserved, such that the response Y is a deterministic function of the 
values of V and of the causal factors of interest. Caveats regarding this assumption in 
real studies are discussed. 



1 Introduction 

Let the binary random variable Y indicate occurrence (Y=1) or non-occurrence 
(Y=0) of an outcome event of interest, and let Y depend causally (in a sense 
to be later clarified) on factors A and B. Also consider a real but possibly 
unobservable variable or set of variables V, which collude with A and B to 
cause the response Y, as illustrated by the directed graph of Figure 1(a). In 
general, even were we to know A, B and V, the response Y would not be 
fully determined, but would retain an element of random variation. In certain 
applications, however, it might be reasonable to assume that there exists some 
relevant set of variables V, which we will term context variables, such that the 
binary response Y is fully determined, without further variation, by V and the 
values we impose on A and B. More precisely, consider the collection of (real or 
hypothetical) interventional regimes where we force A and B to take on some 
configuration (a, b). Then the assumption is that, under such regimes, we have: 

Y = f(A, B, V)    (1) 

for some (typically unknown) function f. Thus, for any value of V, the (a, b) 
configuration which we force upon (A, B) will precisely dictate whether or not 
the event Y = 1 will occur. We call this assumption deep determinism. 

If we can perform an experiment, setting A and B to specific values and ob- 
serving the corresponding Y outcomes (but not observing V), the resulting data 
may help us predict the effect upon Y of intervening on A and/or B. But we 
can probe more deeply. We can investigate context-specific causal effects, that is, the 
effects of A and B upon Y in a context determined by some given value v for 
V. For example, if A and B are logical variables, then for any fixed value v 
of V the function f of Equation (1) will take one of sixteen possible Boolean 
patterns, such as, for example, $Y = A \vee B$, or $Y = A \wedge B$, and so on. Under 
appropriate assumptions, the researcher may be able to infer that a certain pattern 
occurs in a random individual with positive probability. If the pattern is, say, 
$Y = A \wedge B$ (a pattern where the two effects are interdependent), one might take 
this as evidence that, in certain circumstances, A and B operate in the same 
mechanism. [12], [13], [18], [1^, [19] and [15] have explored this territory, and 
proposed a series of empirical conditions for "interdependence" of binary variables 
focused on mechanistic interaction. [17] extends this theory to multi-level 
ordered categorical factors. 

The mathematical form of the tests proposed here is similar to those that the 
above authors have proposed for discrete causal factors. However, by intro- 
ducing novel assumptions, we derive tests valid in the more general case of 
categorical and continuous causal factors, in any combination. 

We also provide a different justification and different assumptions for infer- 
ence about mechanism, in a framework built around the above notion of deep 
determinism. 

Section 2 introduces the concept of interference to capture the idea of two 
variables, A and B, influencing Y by operating through the same mechanism; 
this concept allows for asymmetry in the way A and B interact. Thus we say 
that B interferes with A in producing the event Y=1 when A and B are both 
causal factors for Y, and there exists a possible intervention on B which has the 
power of preventing any intervention on A from causing the event Y=1. This 
can occur without also having A interfering with B. We talk of weak coaction 
[resp., strong coaction] when at least one [resp., each] of A and B interferes 
with the other. 

Figure 1: (a) our initial problem setting; (b) assumptions about the relationships 
between different regimes of data collection are added by the inclusion of intervention 
indicators in the graph, as discussed in Section 4; (c) the effects of A and B on Y are 
jointly, but not individually, unconfounded. 

The above concepts are defined in terms of the behaviour of the system under 
a (real or hypothetical) interventional regime, where A and B are forced to 
take on specific values. However, in Section 6 we show that the proposed 
tests can be applied to data collected under other regimes, e.g. observational. 
In Section 5 the conditions under which these tests are meaningful are 
studied in terms of conditional independence properties of an Augmented Directed 
Acyclic Graph (ADAG) representation of the problem ([S]). The ADAG 
will simultaneously represent the consensus causal theory about the system under 
study, and assumptions about the behaviour of the system across different 
regimes of data collection. ADAGs are briefly reviewed in Section 4. The scientific 
relevance of the method and its practicality in complex study designs 
are illustrated with the aid of two studies of the molecular determinants of 
coronary artery disease, one of numerous areas in biomedical research where an 
assumption of deep determinism could be defensible. 

2 Interference and coaction 

Henceforth we make the deep determinism assumption of Equation (1). The set 
of possible values of A [resp., B, V] is denoted by $\mathcal{A}$ [resp., $\mathcal{B}$, $\mathcal{V}$]. 

Definition 2.1 (Irrelevance) Factor B is (causally) irrelevant to Y in context 
V=v, given A, if $f(a, b, v) = f(a, b', v)$ for all $a \in \mathcal{A}$ and $b, b' \in \mathcal{B}$. 

Definition 2.2 (Interference) We say that A interferes with B in producing 
the event Y=1 if, in some context V=v, B is not irrelevant to Y given A and, 
for some $a \in \mathcal{A}$ and all $b \in \mathcal{B}$, 

$f(a, b, v) = 0.$    (2) 

That is, in that context, there exists a value a such that, when we set A=a, the 
event Y=1 will never happen, whatever value we impose on B. 

Definition 2.3 (Weak coaction) We say that A and B weakly coact to produce 
the event Y=1 if at least one of A and B interferes with the other to produce 
the event Y=1. 



Definition 2.4 (Strong coaction) We say that A and B strongly coact to 
produce the event Y = 1 if each of A and B interferes with the other to produce 
the event Y = 1. 

Figure 2: Electrical circuit illustration of coaction asymmetry. Imagine that an 
electrical voltage is applied between the two terminal pins of the circuit. Let Y = 1 
indicate presence of current between these two pins, and let Y = 0 indicate its 
absence. See main text for discussion of this example. 

Example (Logical): Under a regime of intervention on variables $A \in \{0, 1, 2\}$ 
and $B \in \{0, 1\}$, let the binary response Y depend on these two variables according 
to the logical law $Y = (A=2) \vee ((A=1) \wedge (B=1))$. Neither A nor B is 
irrelevant to Y. Setting A to the value 0 will prevent the event Y=1, whatever 
the value we impose on B. However, when A is set to value 2, event Y=1 will 
happen whatever value we impose on B. Hence B does not interfere with A, 
while A interferes with B, in producing the event Y=1. Thus, A and B coact 
weakly (but not strongly) in producing the event Y=1. 
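The conclusion of the logical example can be checked by brute force. The following is a minimal sketch, assuming a single context (so V is omitted) and finite value sets; the helper name `interferes` is ours and is simply a direct transcription of Definition 2.2.

```python
A_VALUES = (0, 1, 2)
B_VALUES = (0, 1)


def law(a, b):
    """Logical law of the example: Y = (A=2) or ((A=1) and (B=1))."""
    return int(a == 2 or (a == 1 and b == 1))


def interferes(g, x_values, z_values):
    """Check whether X interferes with Z in producing Y=1 (Definition 2.2).

    `g(x, z)` returns the binary response. X interferes with Z when
    (i) Z is not irrelevant to Y given X, and
    (ii) some value x forces g(x, z) = 0 for every z.
    """
    z_not_irrelevant = any(
        g(x, z1) != g(x, z2)
        for x in x_values
        for z1 in z_values
        for z2 in z_values
    )
    x_can_block = any(all(g(x, z) == 0 for z in z_values) for x in x_values)
    return z_not_irrelevant and x_can_block


# A interferes with B: the first argument of `law` is the candidate blocker.
a_interferes_b = interferes(law, A_VALUES, B_VALUES)
# B interferes with A: swap the argument roles.
b_interferes_a = interferes(lambda b, a: law(a, b), B_VALUES, A_VALUES)

weak_coaction = a_interferes_b or b_interferes_a
strong_coaction = a_interferes_b and b_interferes_a
```

With several contexts, the same check would be repeated for each value v of V, declaring interference if it holds in at least one context.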

Example (Electrical): Consider the circuit of Figure 2, where we imagine 
an electrical voltage applied between the two terminal pins, and we take Y = 1 [resp., 
Y = 0] to indicate that current flows [resp., does not flow] between these two 
pins. Let the context variable be U, describing the unobserved state of the 
U-switch, each of the two possible states (OPEN, CLOSED) having positive 
probability. Let variable A index the four possible configurations of the 
A-switches, and variable B the position of the B-switch. The flow of current 
depends on the configuration of the switches via the well known deterministic 
laws of electrical circuits: this model thus satisfies deep determinism. Then 
in context U = CLOSED, variable B is not irrelevant to Y since, when $A_1$ is 
open and $A_2$ closed, acting on B will have an effect on current flow. However, 
when $A_2$ is open, no intervention on the B-switch can restore the current 
flow. Hence, in context U = CLOSED, variable A interferes with B in producing 
current flow. 

Example (Binary): If A and B are binary, Equation (1) implies that, for a 
given value v of V, the function f takes one of sixteen possible patterns. First 
consider patterns $Y = \mathrm{TRUE}$, $Y = \mathrm{FALSE}$, $Y = A$, $Y = \bar{A}$, $Y = B$ and $Y = \bar{B}$. In 
all these patterns, at least one of A or B is irrelevant to Y, and therefore, by 
definition, neither of A and B interferes with the other in producing the event 
Y=1. Next consider patterns $Y = A \vee B$, $Y = \bar{A} \vee B$, $Y = A \vee \bar{B}$ and $Y = \bar{A} \vee \bar{B}$, where 
the disjunctive form implies neither factor interferes with the other. Finally 
consider patterns $Y = A \wedge B$, $Y = \bar{A} \wedge B$, $Y = A \wedge \bar{B}$, $Y = \bar{A} \wedge \bar{B}$, $Y = (A = B)$ and 
$Y = (A \neq B)$, where neither of A and B is irrelevant, and where no value of A 
[resp., of B] produces the event Y=1 unless B [resp., A] takes on a particular 
value. Hence, in these last six patterns, each of A and B interferes with the 
other in producing the event Y=1. We conclude that, in the special case where 
A and B are binary, there can be no interference asymmetry between A and B: 
either each interferes with the other, or neither does. Thus in this case 
weak and strong coaction coincide, and are essentially equivalent to the notion 
of interdependence given by [18]. 

Example (Biological determinism): Suppose a genetic mutation A can in- 
duce a structural change in protein a, causing disease Y in certain individuals 
when the protein is expressed normally. Hence A is not irrelevant to Y. Muta- 
tion B, located in the promoter region of the coding gene of a, reduces the level 
of expression of a. As a consequence, in the above individuals, presence of B 
prevents any structural dysfunction in protein a from causing the disease. 
In this case B interferes with A in causing disease Y, an example of what 
geneticists call "epistasis". 

We conclude this section with a remark. We have discussed "coaction to pro- 
duce". We could similarly have defined "coaction to prevent". Coaction to 
prevent does not imply coaction to produce, nor vice versa. The scientific ap- 
plication and question of interest will usually dictate interest in one of the two 
directions. 

3 Monotonicity 

Sometimes we may be able to make assumptions about the ordering of the 
values of Y in response to configurations of A and B. In the electrical exam- 
ple of the previous section, for example, increasing the number of switches in 
CLOSED position can never cause the current flow to be switched off. Some- 
times assumptions of this kind can be formulated as properties of monotonicity, 
as follows. 

Definition 3.1 The effect of A upon Y is said to be non-decreasing (with respect 
to B) if, for any configuration $(b, v)$ of $(B, V)$, the following implication 
holds: $f(a, b, v) = 1$ and $a' > a$ $\Rightarrow$ $f(a', b, v) = 1$. 

Definition 3.2 The effect of A upon Y is said to be non-increasing (with respect 
to B) if, for any configuration $(b, v)$ of $(B, V)$, the following implication 
holds: $f(a, b, v) = 0$ and $a' > a$ $\Rightarrow$ $f(a', b, v) = 0$. 

Definition 3.3 The effect of A upon Y is said to be monotonic (with respect 
to B) if it is either non-decreasing or non-increasing with respect to B. 

Definition 3.4 The effect of A upon Y is said to be consistent (with respect 
to B) if whenever, for any $(a_1, a_2)$ pair, the inequality $f(a_1, b, v) > f(a_2, b, v)$ 
holds for some $(b, v)$ configuration, it holds for all $(b, v)$ configurations. 
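Definitions 3.1 to 3.3 can be checked mechanically when the value sets are finite. The sketch below is illustrative only: the function names are ours, each element of `contexts` stands for one (b, v) configuration, and Definition 3.4 is deliberately left out.

```python
def is_non_decreasing(f, a_values, contexts):
    """Definition 3.1: f(a, ctx) = 1 and a' > a imply f(a', ctx) = 1."""
    return all(
        f(a_hi, ctx) == 1
        for ctx in contexts
        for a_lo in a_values
        for a_hi in a_values
        if a_hi > a_lo and f(a_lo, ctx) == 1
    )


def is_non_increasing(f, a_values, contexts):
    """Definition 3.2: f(a, ctx) = 0 and a' > a imply f(a', ctx) = 0."""
    return all(
        f(a_hi, ctx) == 0
        for ctx in contexts
        for a_lo in a_values
        for a_hi in a_values
        if a_hi > a_lo and f(a_lo, ctx) == 0
    )


def is_monotonic(f, a_values, contexts):
    """Definition 3.3: non-decreasing or non-increasing."""
    return (is_non_decreasing(f, a_values, contexts)
            or is_non_increasing(f, a_values, contexts))


def law(a, b):
    """Logical example of Section 2: Y = (A=2) or ((A=1) and (B=1))."""
    return int(a == 2 or (a == 1 and b == 1))


def xor(a, b):
    """A pattern whose direction of effect reverses with b."""
    return int(a != b)


law_monotonic = is_monotonic(law, (0, 1, 2), (0, 1))
xor_monotonic = is_monotonic(xor, (0, 1), (0, 1))
```

Here the context is just the value of B; the logical law of Section 2 is non-decreasing in A, whereas the exclusive-or pattern is not monotonic in either direction.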

Clearly monotonicity implies consistency; conversely, under consistency we can 
re-order the values to yield monotonicity. [3] discuss the situation where a 
change in the value of A may give rise to a reversal of the effect of B upon out- 
come. Such qualitative interaction violates consistency. Some authors consider 
qualitative interaction to be interpretable in terms of mechanism. A formal 
test, different from the standard statistical test for departures from additivity, 
should be performed to assess whether a qualitative interaction could be due to 
chance variation. One such test has been proposed by [5]. The tests proposed in 
this paper, which also differ from standard statistical interaction tests, establish 
conditions for an interpretation of interaction in terms of mechanism without 
necessarily requiring that the underlying interaction be qualitative. 

4 Augmented Directed Acyclic Graphs 

Coaction has been defined under a (real or hypothetical) interventional regime. 
The tests for coaction we shall later propose may be applied more generally, 
such as when the data are observational. This, however, will require stringent 
assumptions, for example that V be conditionally independent of A and B and 
of the way these two variables have been generated. In many applications it will 
be possible, and is then helpful, to represent such assumptions, in combination 
with further assumptions based on our causal understanding of the problem, by 
means of an Augmented Directed Acyclic Graph (ADAG). 

Examples of ADAGs are given in Figure 1. Figure 1(b) is an ADAG specialisation 
of the simple problem setting of Figure 1(a). An important feature of 
ADAGs is inclusion of intervention indicators, exemplified in Figures 1(b)-(c) 
by nodes $\sigma_A$ and $\sigma_B$. These nodes take values indicating the particular regime, 
observational or experimental, under which the values of a corresponding domain 
variable arise. With A and B binary, for example, each of $\sigma_A$ and $\sigma_B$ will 
have possible values in $\{\emptyset, 0, 1\}$, the interpretation being that, when $\sigma_A = \emptyset$, the 
variable A is generated randomly by Nature, under the circumstances governing 
the observational data; while $\sigma_A = a \in \{0, 1\}$ indicates an interventional 
setting in which value a is imposed on A; and similarly for B. Although regime 
indicators are not random variables, we can still query the ADAG, using the 
d-separation criterion of [5], or the equivalent moralisation criterion of [10], to 
read off conditional independencies implied by the graph. These independencies 
will generally reflect properties of the system under study and judgements 
about the way we expect the system to behave under data collection regimes 
different from the actual one. The graphs of Figures 1(b) and 1(c), for example, 
embody the conditional independence property, expressed in the notation of 
[7]: $Y \perp\!\!\!\perp (\sigma_A, \sigma_B) \mid (A, B)$, read as "Y is conditionally independent of $(\sigma_A, \sigma_B)$, 
given $(A, B)$". This represents an assumed property of invariance across regimes: 
that once we know the values of A and B, the distribution of Y will not further 
depend on the regime of data collection, as represented by $(\sigma_A, \sigma_B)$. In other 
words, in these two examples, the distribution of Y does not depend on the way 
the (A, B) configuration has arisen, be it observationally or interventionally. 
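To illustrate how such an independence can be read off a graph, here is a small sketch of the moralisation criterion in plain Python. The two graphs are hypothetical: the first is only our guess at a graph in the spirit of Figure 1(b), the second adds an arrow from V to A to show how the invariance property can fail. The node names and the `parents` dictionary layout are assumptions made for this illustration.

```python
from itertools import combinations


def ancestral_set(parents, nodes):
    """Return `nodes` together with all of their ancestors in the DAG."""
    seen, stack = set(), list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def separated(parents, xs, ys, zs):
    """Moralisation criterion: are xs and ys separated by zs?

    Steps: restrict to the ancestral set of xs, ys and zs; moralise
    (link each node to its parents and marry co-parents); delete zs;
    then test whether any path still links xs to ys.
    """
    xs, ys, zs = set(xs), set(ys), set(zs)
    keep = ancestral_set(parents, xs | ys | zs)
    adj = {node: set() for node in keep}
    for child in keep:
        pars = list(parents.get(child, ()))
        for p in pars:
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(pars, 2):
            adj[p].add(q)
            adj[q].add(p)
    seen, stack = set(), list(xs - zs)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(adj[node] - zs - seen)
    return not (seen & ys)


# Hypothetical graph in the spirit of Figure 1(b): child -> list of parents.
FIG_1B_LIKE = {"A": ["sigma_A"], "B": ["sigma_B"], "Y": ["A", "B", "V"]}
# Hypothetical variant in which V also influences A.
V_AFFECTS_A = {"A": ["sigma_A", "V"], "B": ["sigma_B"], "Y": ["A", "B", "V"]}

regimes = {"sigma_A", "sigma_B"}
invariant = separated(FIG_1B_LIKE, {"Y"}, regimes, {"A", "B"})
invariant_when_v_affects_a = separated(V_AFFECTS_A, {"Y"}, regimes, {"A", "B"})
```

In the first graph the regime indicators are separated from Y once A and B are given; in the variant, conditioning on A opens a path from the indicator of A to Y through V, so the property is no longer implied.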

5 The core conditions 

Identifiability conditions for mechanistic interaction are typically succinctly 
stated in terms of the effects of A and B on Y having to be "unconfounded", 
conditional on some observed variable C. We adopt a different approach, assuming 
a consensus ADAG representation of the problem is available. Conditions 
for validity of the test proposed in the next section are then phrased in terms of 
conditional independence properties of the ADAG. This discipline allows us to 
be more precise in our claims than a formulation in terms of "no confounding". 
Another advantage of the ADAG-based approach is that it makes it easier to 
relate the conditions for applicability of a test to the substantive assumptions 
about the problem. 

The assertion "the (joint) effects of A and B on Y are unconfounded" might 
be interpreted as saying that there exists an observed variable C such that the 
following two conditions are satisfied: 

$C \perp\!\!\!\perp \sigma$ and $Y \perp\!\!\!\perp \sigma \mid (A, B, C)$    (3) 

where $\sigma := (\sigma_A, \sigma_B)$, with possible values $\sigma = (a, b)$, corresponding to setting 
A = a, B = b, and $\sigma = (\emptyset, \emptyset)$, also denoted by $\sigma = \emptyset$, when both A and B arise 
naturally. In this case we say that C is a sufficient covariate for the joint effects 
of A and B on Y ([9]). In accordance with the "back-door criterion" of [11], 
under these conditions the joint causal effect of (A, B) on Y will be estimable 
from observational data when C is also observed. Note that these conditions 
need not imply that C is sufficient for the individual causal effects of each of A 
and B on Y (which would involve extending (3) to apply also when only one 
factor is intervened on, i.e. for $\sigma$ of the form $(a, \emptyset)$ or $(\emptyset, b)$). Thus in cases (b) 
and (c) of Figure 1, $C = \emptyset$ is sufficient for the joint effects of A and B on Y, but 
is sufficient for the individual effects only in Figure 1(b). 

However, neither sufficiency for the joint effects nor sufficiency for the individual 
effects is what we need to ensure applicability of the test of the next section for 
a general regime of data collection. In our analysis, the additional observable 
variable C must be a function of the overall context variable V featuring in the 
"deep determinism" property (1). Thus we can consider V = (C, U), with C 
observed and U unobserved. 

We shall require the simultaneous validity of the following four core condi- 
tions: 

Definition 5.1 (Core conditions) There exists a (possibly empty) set C of 
observable context variables and a set U of (typically unobserved) context variables 
such that: 

1. (deep determinism) $Y = f(A, B, C, U)$ for some deterministic function f, 
which is the same no matter how the variables (A, B, C, U) are generated. 

2. $Y \perp\!\!\!\perp \sigma \mid (A, B, C, U)$ 

3. $U \perp\!\!\!\perp (A, B, \sigma) \mid C$ 

4. $A \perp\!\!\!\perp B \mid (C, \sigma)$ 

Whenever Condition 1 is satisfied, we say that Y is functional with respect to 
{A, B, C, U). Condition 2 essentially repeats the second part of Condition 1, but 
it is helpful to display it explicitly. Condition 3 says that, conditionally on C, 
variable U has the same distribution in all regimes, and is independent of A and 
B (this will hold, in particular, if the full context variable (C, U) has the same 
distribution in all regimes and is independent of A and B); while Condition 4 
requires A and B to be independent, given C, in the observational regime (this 
property necessarily holding when A and B are set by intervention). 

The following theorem can be proved straightforwardly using general properties 
of conditional independence ([7], [H]). 

Theorem 5.1 Core conditions 2 and 3 imply $Y \perp\!\!\!\perp \sigma \mid (A, B, C)$. 
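A short derivation may help make Theorem 5.1 concrete. The sketch below assumes, purely for readability, that U is discrete and that all the conditional probabilities shown are well defined in every regime; these are simplifying assumptions of ours and not part of the theorem.

```latex
\begin{align*}
P(Y = y \mid a, b, c, \sigma)
  &= \sum_u P(Y = y \mid a, b, c, u, \sigma)\, P(u \mid a, b, c, \sigma) \\
  &= \sum_u P(Y = y \mid a, b, c, u)\, P(u \mid a, b, c, \sigma)
     && \text{(core condition 2)} \\
  &= \sum_u P(Y = y \mid a, b, c, u)\, P(u \mid c)
     && \text{(core condition 3)}
\end{align*}
```

The final expression does not involve $\sigma$, which is what the statement $Y \perp\!\!\!\perp \sigma \mid (A, B, C)$ asserts.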

Our core conditions imply the second condition of Equation (3), but not the 
first. It seems useful and instructive to discuss the differences between the two 
sets of conditions with the aid of examples. In the following examples interest 
focuses on testing coaction of variables A and B in producing the event Y=1, 
based on observational data about variables (A, B, Y) and, sometimes, a further 
variable Z. 

Figures 1(b)-(c) satisfy the conditions of Equation (3) when $C = \emptyset$. In both 
these examples, the distribution of Y given (A, B) does not depend on the way 
the configuration of values of (A, B) is generated, be it observationally or by 
intervention. However, while Figure 1(b) satisfies the core conditions once we 
assume Y to be functional with respect to (A, B, U), Figure 1(c) violates core 
condition 4. 

Now consider the example of Figure 3(c). If the researcher engaged in a test 
of coaction between A and B follows the "no confounding" conditions of Equation (3), 
he or she will notice that these conditions are satisfied for $C = \emptyset$, and 
might therefore proceed to perform the test without conditioning on Z. By 
contrast, if the researcher follows the core conditions of Definition 5.1, he or she 
will notice that ignoring Z (that is, setting $C = \emptyset$) is valid only under the assumption 
that Y is functional with respect to (A, B, U). This appears to be 
a tremendously stringent assumption, which we may accept only if, for every 
value of U, variable Y is a deterministic function of (A, B, Z) and Z is a deterministic 
function of A. A more appropriate choice, according to the conditions 
of Definition 5.1, is to set C = Z. The latter choice would make more sense from 
a further point of view, that is, it would test coaction of the effect of B (on 
Y) and the direct effect of A (on Y), unmediated by Z. In summary, in this 
example, the two sets of conditions lead to different choices, in the sense that 
the best choice according to the core conditions violates the "no confounding" 
conditions of Equation (3). 

Many of the above considerations also apply to the example of Figure 3(d). 
In particular, in this last example, setting $C = \emptyset$ would appear a safe option 
according to the "no confounding" conditions of Equation (3). And it would, in 
addition, satisfy core conditions 2 to 4. A possible difficulty with this choice 
would however arise when negotiating core condition 1. In the light of core 
condition 1, choice $C = \emptyset$ means we are ready to assume Y to be deterministic 
when we condition on (A, B, U), but not on Z. This is sensible only if we believe 
Z itself to be a deterministic function of its predecessors in the graph. Neither 
does the option C = Z, in this example, solve the problem. For conditioning on Z 
will typically introduce dependence between U and A, violating core condition 
3. 

6 Testing coaction 

We now present a test for coaction of variables A and B in producing the event 
Y=1, assuming that there exists a (possibly empty) set C of observed variables 
such that the core conditions of the previous section are valid. We allow A 
and B to be ordered categorical or continuous variables, in any combination. 
If either variable is not binary, we consider some dichotomisation of its range. 
Thus for A we would choose a threshold $t_a$ and define $\alpha := \{a \in \mathcal{A} : a > t_a\}$, 
$\bar{\alpha} := \{a \in \mathcal{A} : a \le t_a\}$. Similarly for B we would have $t_b$, $\beta$, $\bar{\beta}$. We also use $\alpha$ 
to denote the truth value (0 or 1) of the event $A \in \alpha$, etc. 

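In the sequel, all probabilities are computed under the observational regime 
$\sigma = \emptyset$. 

For $i, j = 0, 1$, let 

$$
R_{ijc} := P(Y=1 \mid \alpha = i, \beta = j, C=c)
        = \int_u R_{ijc}(u)\, P(u \mid C=c)
        \qquad \text{(core condition 3)}
$$

where, for any value u of U, 

$$
\begin{aligned}
R_{ijc}(u) &:= P(Y=1 \mid \alpha = i, \beta = j, C=c, U=u) \\
&= \int_{a:\,\alpha = i} \int_{b:\,\beta = j} P(Y=1 \mid a, b, C=c, U=u)\, P(a, b \mid \alpha = i, \beta = j, C=c, U=u) \\
&= \int_{a:\,\alpha = i} \int_{b:\,\beta = j} f(a, b, c, u)\, P(a, b \mid \alpha = i, \beta = j, C=c, U=u)
   && \text{(core conditions 1, 2)} \\
&= \int_{a:\,\alpha = i} \int_{b:\,\beta = j} f(a, b, c, u)\, \frac{P(a, b \mid C=c)}{P(\alpha = i, \beta = j \mid C=c)}
   && \text{(core condition 3)} \\
&= \int_{a:\,\alpha = i} P(a \mid \alpha = i, C=c) \int_{b:\,\beta = j} f(a, b, c, u)\, P(b \mid \beta = j, C=c)
   && \text{(core condition 4)}
\end{aligned}
\qquad (4)
$$
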

Definition 6.1 Variable A is said to be $\alpha$-insensitive with respect to Y if the 
following implication is valid for all $(b, c, u)$: 

if $f(a, b, c, u) = 0$ for some $a \in \alpha$ and $a' > a$, then $f(a', b, c, u) = 0$.    (5) 

We similarly define the $\beta$-insensitivity property for B. Trivially $\alpha$-insensitivity 
holds if $\alpha$ consists of a single point. We are now ready to state the main theorem: 

Theorem 6.1 Let the binary outcome variable Y depend on observed variables 
(A, B, C) and on unobserved variable U, where A and B are allowed to be ordered 
categorical or continuous, in any combination of these two types. Let the effect 
of A [resp., B] upon Y be monotonic with respect to B [resp., A], and suppose 
that, for some dichotomizations of A and B, and some value c of C: 

$R_{11c} - R_{10c} - R_{01c} > 0.$    (6) 

Then under the core conditions and the $\alpha$-insensitivity property for A, variable 
B interferes with A in producing the event Y = 1. Similarly, whenever the 
$\beta$-insensitivity property holds for B, variable A interferes with B in producing 
the event Y = 1; in either case A and B weakly coact to produce the event Y=1. 
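As an illustration of the quantity in Equation (6), the sketch below computes a plug-in estimate of the contrast from observational records within a single stratum C=c. The function name, the record layout `(a, b, y)` and the toy data are hypothetical. The sketch returns only a point estimate; deciding whether the contrast is credibly positive would need an inferential step that is not shown here.

```python
def coaction_contrast(records, t_a, t_b):
    """Empirical R11 - R10 - R01 for records (a, b, y) from one stratum C=c.

    A and B are dichotomised as alpha = [a > t_a] and beta = [b > t_b];
    R_ij is the observed proportion of y = 1 in the cell (alpha=i, beta=j).
    """
    cells = {}  # (i, j) -> [number of records, number with y = 1]
    for a, b, y in records:
        key = (int(a > t_a), int(b > t_b))
        cell = cells.setdefault(key, [0, 0])
        cell[0] += 1
        cell[1] += y

    def rate(i, j):
        n, events = cells.get((i, j), (0, 0))
        if n == 0:
            raise ValueError(f"no records in cell alpha={i}, beta={j}")
        return events / n

    return rate(1, 1) - rate(1, 0) - rate(0, 1)


# Toy data (made up): four records per cell, thresholds t_a = t_b = 0.
toy = (
    [(1, 1, y) for y in (1, 1, 1, 0)]    # cell (1, 1): rate 0.75
    + [(1, 0, y) for y in (1, 0, 0, 0)]  # cell (1, 0): rate 0.25
    + [(0, 1, y) for y in (1, 0, 0, 0)]  # cell (0, 1): rate 0.25
    + [(0, 0, y) for y in (0, 0, 0, 0)]  # cell (0, 0): not used
)
contrast = coaction_contrast(toy, t_a=0, t_b=0)
```

A positive value is only suggestive: the conclusion of Theorem 6.1 also rests on the core conditions, monotonicity and the relevant insensitivity property.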

Proof Equation (6) can be expressed as 

$$\int_u \left[ R_{11c}(u) - R_{10c}(u) - R_{01c}(u) \right] P(u \mid C=c) > 0. \qquad (7)$$

It follows that there is a positive probability of obtaining a value $u^*$ of U such that 
$R_{11c}(u^*) - R_{10c}(u^*) - R_{01c}(u^*) > 0$; in particular, $R_{11c}(u^*) - R_{10c}(u^*) > 0$. Thus, 
using (4), 

$$
\int_{a \in \alpha} P(a \mid A \in \alpha, C=c)
\left[ \int_{b \in \beta} f(a, b, c, u^*)\, P(b \mid B \in \beta, C=c)
     - \int_{b \in \bar{\beta}} f(a, b, c, u^*)\, P(b \mid B \in \bar{\beta}, C=c) \right] > 0,
$$

from which it follows that there exists a value $a_1 \in \alpha$ such that 

$$
\int_{b \in \beta} f(a_1, b, c, u^*)\, P(b \mid B \in \beta, C=c)
> \int_{b \in \bar{\beta}} f(a_1, b, c, u^*)\, P(b \mid B \in \bar{\beta}, C=c). \qquad (8)
$$

Since the left-hand side of the above inequality is thus positive, and f = 0 or 1, we 
must have 

$f(a_1, b_1, c, u^*) = 1$ for some $b_1 \in \beta$.    (9) 

Also we cannot have $f(a_1, b, c, u^*) = 1$ for all $b \in \bar{\beta}$, since in this case the right-hand 
side of (8) would equal 1, whereas the left-hand side cannot exceed 1. We deduce 
that 

$f(a_1, b_2, c, u^*) = 0$ for some $b_2 \in \bar{\beta}$.    (10) 

Because Equation (6) is symmetrical in A and B, we similarly obtain: 

$f(a_2, b_3, c, u^*) = 1$ for some $a_2 \in \alpha$ and $b_3 \in \beta$,    (11) 

$f(a_3, b_3, c, u^*) = 0$ for some $a_3 \in \bar{\alpha}$.    (12) 

Under the assumed monotonicity of the effect of B upon Y, and remembering that $\beta$ 
lies above $\bar{\beta}$, Equations (9)-(10) imply that f is non-decreasing with B for any 
configuration of (A, C, U). Similarly, Equations (11)-(12) imply that f is non-decreasing 
with A for any configuration of (B, C, U). Equations (9)-(12) tell us 
that there is a context $(U, C) = (u^*, c)$ where variables A and B are not irrelevant to 
Y with respect to each other. Then according to Definition 2.2, in order to prove that 
B interferes with A in producing the event Y=1, we only need prove that, for some 
value imposed on B, no value of A will produce the event Y=1, that is: 

$f(a, b_2, c, u^*) = 0 \quad \forall a.$    (13) 

In fact, the following two implications follow from Equation (10): 

$a < a_1 \Rightarrow f(a, b_2, c, u^*) = 0$    (f is non-decreasing with A) 

$a > a_1$ and $a_1 \in \alpha \Rightarrow f(a, b_2, c, u^*) = 0$    (A is $\alpha$-insensitive with respect to Y) 

from which Equation (13) follows. We then conclude that, under an assumed 
$\alpha$-insensitivity condition for A, Equation (6) implies that variable B interferes with 
A in producing the event Y = 1. Similarly we can prove that, under an assumed 
$\beta$-insensitivity property for B, variable A interferes with B in producing the event 
Y = 1, which completes the proof. 
Remark 1: The theorem holds also in those situations where we can have its 
conditions satisfied by an appropriate recoding of A and B. 

Remark 2: The theorem can be applied conditional on the generic individual 
belonging to a particular population stratum defined on the basis of (A, B). 

The following example illustrates the two remarks above. Consider a discrete 
variable $A \in \{1, 2, 3, 4\}$ and a continuous variable B on the (0, 1) interval. 
Assume monotonicity of the effects of A and B on Y, and let us restrict attention 
to the stratum of individuals with A > 1. Let us then recode variable A by 
setting $A^* := 5 - A$. Then suppose for $\alpha := \{A^* = 3\}$ and $\beta := \{B > 0.5\}$ that the 
data strongly support the inequality $R_{11} - R_{01} - R_{10} > 0$. Because $\alpha$ consists 
of a single point, and therefore A is $\alpha$-insensitive with respect to Y, we may 
conclude that B interferes with A in producing the event Y=1. The reverse 
inference, that A interferes with B, is possible if B is $\beta$-insensitive with respect 
to Y, but this assumption may be problematic since $\beta$ does not consist of a 
single point. 
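The stratum restriction, recoding and dichotomisation of this example can be written down directly. The helper below is a hypothetical sketch (the name `dichotomise` and the use of `None` for individuals outside the stratum are our choices); it returns the truth values of the events $\alpha$ and $\beta$ for one individual.

```python
def dichotomise(a, b):
    """Map raw (A, B) to the truth values (alpha, beta) of the example.

    Individuals outside the stratum A > 1 are excluded (None).
    Within the stratum, A* = 5 - A takes values in {1, 2, 3}, so the
    event alpha = {A* = 3} is the single top point of the recoded scale.
    """
    if a <= 1:
        return None
    a_star = 5 - a
    alpha = int(a_star == 3)
    beta = int(b > 0.5)
    return alpha, beta


examples = [dichotomise(2, 0.7), dichotomise(4, 0.2), dichotomise(1, 0.9)]
```

Because the recoded scale within the stratum has 3 as its largest value, no recoded value lies above the single point of $\alpha$, which is why $\alpha$-insensitivity holds trivially in this example.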

7 Examples 

We discuss the examples of Figures 3(a)-(b). 

Example of Figure 3(a) Let Y be an indicator of disease, depending on 
a pair (A, B) of genetic variants in linkage equilibrium with each other; and 
let covariate Z, representing genealogical information, say, be sufficient for the 
effects of A and B on Y. Then the graph of Figure 3(a) might be an acceptable 
representation of the problem. Suppose further there is consensus that Y is 
functional with respect to (A, B, U), for example because the effects of the two 
variants on Y are thought to operate through a common molecular mechanism. 
Then the core conditions are satisfied if we take C = Z, and so observational 
(A, B, Z, Y) data can be used to test for A-B coaction to produce the event 
Y=1. 

Example of Figure 3(b) In this example, where C is necessarily empty, node 
U is not independent of A, which violates core condition 3. Consequently a set 
of observational (A, B, Y) data will typically not suffice for us to be able to test 
productive coaction of A and B by using the proposed method. 

8 Relations with previous work 

In certain formal frameworks for "statistical causality", including Pearl's structural 
equation formulation ([11], chapter 7) and the potential response framework 
of [14], it is possible to construct a totally fictitious mathematical variable 
V which makes (1) true by mathematical fiat. Our approach differs in that we 
conceive of the context variable V as both real and relevant, and thus in principle 
observable; its relationships with the remaining variables in the problem 
need be negotiated and explicitly represented in the causal model. This has 
practical consequences for data analysis. Consider, for example, Figures 3(c) 
and (d). These two examples differ only in that, in the former, on the basis 
of contextual knowledge, we judge the unobserved "context variables" U that 
differentiate the possible behaviours of Y in response to (A, B) to be a priori 
unrelated to Z, whereas in the latter example, these unknown variables are 
judged to act as unobserved confounders of the effect of Z upon Y. We have 
seen that this difference has consequences on our decision to apply the method, 
and whether or not we should condition on Z. 

Also, our method replaces the generic assumption of "the effect of A and B on 
Y is not confounded given C" with a formal set of independencies (the core 
conditions) that need to be satisfied by the causal model. We have seen in 
Section 5 that this formal method can capture important differences between 
different applications. 

9 Illustrative study: rs1333040 coacts with statins 

Within the Italian genetic study of early-onset myocardial infarction ([I]), between 
1996 and 2002, an incident sample of 2050 cases was selected on the basis 
of a hospitalization for myocardial infarction (MI) between age 40 and age 45, 
over a set of 125 Coronary Care Units spread nationwide. After entering the 
study, each subject provided a blood sample from which plasma was separated 
and DNA extracted, and was then prospectively monitored for an average 
of 12 years of follow-up. Let the outcome of the follow-up be represented by a 
binary variable, Y, indicating whether a re-infarction or cardiovascular death 
was observed (Y=1), or not observed (Y=0), within a period of 120 months from 
study entry. 

The research group agrees on the assumptions represented in the ADAG of
Figure 4. According to the graph, each case is characterized by the following
variables. Variable G is a function of the genotype at rs1333040, a single
nucleotide polymorphism (SNP) located in chromosomal region 9p21.3. We define
G to take value 1 in the presence of two copies of the major rs1333040 allele,
and value 0 otherwise. Variable Z is the severity of coronaropathy at study
entry. Variable T is the calendar year at study entry. Variable U represents
a set of unknown confounders. Variable S indicates whether the subject was
assigned to statin treatment right after study entry (S=1) or never after study
entry (S=0), and I indicates presence/absence of hypercholesterolemia at study
entry. Variable T here acts as a surrogate for relevant factors that vary with
calendar time. These include therapy evolution, progress of medical knowledge
and impact of legislation. These factors are assumed to influence both medical
practice, specifically concerning use of statins, and the clinical outcome Y.
During the study period, National Guidelines concerning use of statins had not
yet come into force, and the decision whether or not to administer statins to
patients of the kind we are studying was taken more or less randomly by the
recruiting Coronary Care Unit, though to some extent dependent on whether or
not the patient was found to have hypercholesterolemia at study entry. This is
accounted for in the graph by the I → S arrow. The graph also conservatively
allows that susceptibility to hypercholesterolemia may depend on the genotype
at the SNP of interest, although evidence in support of this has never been
found.

Figure 4: ADAG representation of our illustrative study of coaction between a gene
tagged by single nucleotide polymorphism rs1333040 and statin treatment in producing
myocardial infarction. See main text for a justification of the causal relationships
depicted in this graph.

Instead of performing separate analyses within strata of (T, I), we restrict anal-
ysis to the stratum of patients with hypercholesterolemia (I=1), and assume
that, in this stratum, the effect of T does not interact with G and S. We then
model the effect of (G, S, T) on Y in the stratum of patients with I = 1 via the
following linear risk Bernoulli model:

    Y ~ Bernoulli(π),
    π = α + φ_{G=1} 1(G=1) + φ_{S=0} 1(S=0) + γ_{(S=0)×(G=1)} 1(S=0) 1(G=1) + δ_t t,

where δ_t represents a linear effect of calendar year, in years since 1970. If
our data provide evidence of a departure of parameter γ_{(S=0)×(G=1)} from zero,
we say that variables S and G interact statistically in the stratum of hyper-
cholesterolemic patients and, indeed, the results shown in Table 1 support this
conclusion. The data seem to tell us that the beneficial effect of statins, in
terms of reduction of risk of re-infarction in a hypercholesterolemic patient, is
stronger in patients with G = 0, and that the highest risk is found in those
hypercholesterolemic patients with G = 0 who do not receive statins.
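
The reading of Table 1 given above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' analysis code: it assumes the linear predictor is additive in the indicators of G = 1 and S = 0, their product, and calendar time (as the parameter labels in Table 1 suggest), and it uses point estimates only. The function names and the example value of t are hypothetical.

```python
# Point estimates copied from Table 1; the scale is whatever the table reports.
TABLE1 = {
    "alpha": -2.33,    # intercept
    "phi_G1": -0.06,   # G = 1 (two copies of the major allele)
    "phi_S0": 1.41,    # S = 0 (no statin treatment)
    "delta_t": -0.02,  # calendar year, in years since 1970
    "gamma": -1.0,     # (S = 0) x (G = 1) interaction
}


def linear_predictor(g, s, t, est=TABLE1):
    """Fitted linear predictor for genotype indicator g, statin indicator s, time t."""
    no_statin = 1 - s  # the model is coded on the indicator of S = 0
    return (est["alpha"]
            + est["phi_G1"] * g
            + est["phi_S0"] * no_statin
            + est["gamma"] * no_statin * g
            + est["delta_t"] * t)


def excess_without_statins(g, t, est=TABLE1):
    """Untreated (S = 0) minus treated (S = 1) fitted value for genotype g."""
    return linear_predictor(g, 0, t, est) - linear_predictor(g, 1, t, est)


T_EXAMPLE = 30  # hypothetical calendar time (year 2000), for illustration only
cells = {(g, s): linear_predictor(g, s, T_EXAMPLE) for g in (0, 1) for s in (0, 1)}
highest_cell = max(cells, key=cells.get)

print("highest fitted value at (G, S) =", highest_cell)
for g in (0, 1):
    print(f"G = {g}: excess without statins = {excess_without_statins(g, T_EXAMPLE):.2f}")
```

Under these assumptions the statin contrast does not depend on t, because calendar time enters additively; the comparison between genotypes is driven entirely by the interaction estimate.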

Let us now shift from predictive to mechanistic inference, by examining whether
the interaction between variables S and G can be upgraded from "statistical"
to "mechanistic". In order to do this, we need to use a different statistical test,
and to be explicit about the set of (fairly strong) assumptions discussed in the
previous section. One of these is monotonicity of the effects, which appears
to be reasonable, since it does not require prior knowledge of the "deleterious"
allele of the SNP. Next, we need to assume that the core conditions hold. Define
C = (T, I). With this choice, core conditions 2 to 4 are satisfied, although core
condition 1 (that Y be a deterministic function of (G, S, T, I, U)) could be
problematic here unless we assume that, for any given value of U, variable G
influences Z and Y through the same molecular mechanism whereby interference
with the effect of statin takes place. After accepting the core conditions, in
accordance with the theory of Section 5, we partition the possible values of the
rs1333040 genotype into the set α and its complement ᾱ. We define α to indicate
presence of two copies of the most frequent rs1333040 allele, corresponding to
G = 1, so that ᾱ will represent the remaining two genotypic categories. We
define β to indicate that the patient is given statins, corresponding to S = 1,
and we define β̄ to indicate that the patient is not given statins, corresponding
to S = 0. Since each of α and β contains just one value of the corresponding
variable, α-insensitivity and β-insensitivity hold in this case.

It is easy to show that, given T = t, the above model implies R_{11t} - R_{10t} - R_{01t} =
γ_{(S=0)×(G=1)} - α - δ_t t. This quantity, according to Table 1, is significantly
greater than zero for all relevant values of T. Hence, in the light of our theory
and under the assumptions discussed above, we conclude that G and S strongly
coact to produce re-infarction. The interpretation may be phrased in a number
of ways. One is to say that there exists some context in which hypercholes-
terolemic patients with the G = 1 genotype are safe from re-infarction, whether
or not they take statins, whereas those with G = 0 develop or avoid re-infarction
depending on whether they take statins. A counterfactual rephrasing of this is to
say that some patients with G = 0, who developed re-infarction, would not have
developed it, had they received statins. All this can be interpreted to suggest
that statins and some gene tagged by rs1333040 influence susceptibility to
re-infarction through a common pathway, which motivates a future effort to identify
which gene this is, and what its function is. Some researchers might have reached
the same conclusions from the results of the regression analysis, without con-
sideration of the theoretical framework proposed in this paper. In our opinion,
that would be careless. Not only do such conclusions require a statistical test of
the kind proposed in this paper, which differs from a standard interaction test,
but they also require explicit consideration of the (fairly strong) assumptions
we have discussed in this paper.
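
For readers who want the intermediate step behind the identity for R_{11t} - R_{10t} - R_{01t} used in this section, a short derivation follows. It is a sketch under two assumptions that are our reading rather than something stated explicitly: that the risk is additive in the indicators 1(G=1) and 1(S=0), their product, and δ_t t, and that the first and second subscripts of R refer to 1(G=1) and 1(S=0) respectively (the result is unchanged if the two are swapped).

```latex
\begin{align*}
R_{11t} &= \alpha + \phi_{G=1} + \phi_{S=0} + \gamma_{(S=0)\times(G=1)} + \delta_t t,\\
R_{10t} &= \alpha + \phi_{G=1} + \delta_t t,\\
R_{01t} &= \alpha + \phi_{S=0} + \delta_t t,\\
R_{11t} - R_{10t} - R_{01t}
        &= \gamma_{(S=0)\times(G=1)} - \alpha - \delta_t t .
\end{align*}
```

Plugging in the point estimates of Table 1 gives -1.0 + 2.33 + 0.02 t = 1.33 + 0.02 t, which is positive for every t ≥ 0. This concerns the point estimate only; a statement about significance additionally requires the joint sampling variability of the three estimates, which Table 1 does not report.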

10 Illustrative study: rs4620585 coacts with smoking

Each of the cases in the study of the previous section was paired with a control, 
matched by age and geographical region of origin. After eliminating individ- 
uals with missing data, 1666 controls remained available for the analysis. In 
this section we concentrate on "smoking habit" , a binary indicator obtained by 
dichotomizing an (imprecisely recorded) daily number of cigarettes. We tested
for possible coaction between smoking habit and one or more SNPs of a list of
ten candidates from an independent study. An interesting signal was found at
SNP rs4620585 on human chromosome 1, which had never previously been associated
with a disease. The remaining discussion restricts attention to SNP rs4620585. Let
A signify rare rs4620585 homozygosity (RRH), and B signify "smoker". Let Y 
represent occurrence of early MI. We assume that the core conditions, and in 
particular condition 4, hold in this problem, once we assume (in accord with 
current knowledge) that the gene implicated by rs4620585 has no influence on 
smoking habit or addiction to nicotine. 

On the basis of our data, we performed a linear-odds regression of the case- 
control indicator on SNP rs4620585 and smoking habit. This analysis yielded 
the estimated coefficients of Table 2. Because our "early MI" endpoint is rare,
we may safely assume that the selection effect implicit in the case-control study
affects the interaction parameter γ and the intercept α (both in principle
estimable only through a prospective study) only through multiplication by a
common, unknown, positive constant. Hence we may take positivity of (γ - α) to imply
positivity of the linear combination R_{11} - R_{01} - R_{10} of prospective risks. Since
Table 2 shows the quantity R_{11} - R_{01} - R_{10} to be significantly greater than zero
(no multiple testing adjustment), we conclude in favour of a potential mecha-
nistic interaction between SNP rs4620585 and smoking. One interpretation of
this result is to say that there are circumstances in which some patients, by 
virtue of a beneficial variant tagged by rs4620585, are safe from an early MI 
regardless of their smoking, whereas patients without that variant, who in the 
same circumstances developed an early MI, would have avoided it, had they not 
smoked. 
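
To make the role of the contrast γ - α concrete, the sketch below computes it from the point estimates of Table 2 together with a rough Wald-type z statistic. This is not the test used in the study. Table 2 does not report the covariance between the two estimates, so the covariance is set to zero here purely as a labeled assumption, and the resulting standard error is only indicative. The function names are hypothetical.

```python
import math

# Point estimates and standard errors copied from Table 2.
GAMMA, SE_GAMMA = 0.9, 0.22    # smoker x RRH interaction
ALPHA, SE_ALPHA = 0.25, 0.013  # intercept


def contrast_z(gamma, se_gamma, alpha, se_alpha, cov=0.0):
    """Return (gamma - alpha, its standard error, z statistic).

    cov is the covariance between the two estimates; the default of 0.0 is
    an assumption made for illustration, since Table 2 does not report it.
    """
    diff = gamma - alpha
    se = math.sqrt(se_gamma ** 2 + se_alpha ** 2 - 2.0 * cov)
    return diff, se, diff / se


def one_sided_p(z):
    """Upper-tail probability of a standard normal variable at z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


diff, se, z = contrast_z(GAMMA, SE_GAMMA, ALPHA, SE_ALPHA)
print(f"gamma - alpha = {diff:.2f}, approx. SE = {se:.3f}, z = {z:.2f}")
print(f"one-sided p (zero-covariance assumption) = {one_sided_p(z):.4f}")
```

A one-sided tail is used because the condition of interest is positivity of the contrast. With the fitted covariance matrix available, the actual covariance should be passed in place of the zero default.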

11 Discussion 

Statistical interaction — departure from some parametric model of independent 
effects of explanatory variables — is not necessarily interpretable as reflecting 
an underlying mechanism, not least because most statistical models are mathe-
matical fictions ([!]). This is especially true when the modeller has to negotiate
continuous explanatory variables. Our proposed sufficient conditions for declar-
ing coaction between continuous variables do not invoke specific parametric
forms of dependence, and appear to provide a better basis for inference about
mechanistic interaction. The proposed method does, however, rely on the as-
sumption that the mechanism studied is, at some deep level, deterministic,
which is by no means universally appropriate, as shown by [5]. This assumption
can, however, be defensible in some fields of application, and our choice of an 
illustrative study in molecular medicine reflects such concerns. 

Finally, we would re-iterate that, unlike previous approaches to the problem, the 
proposed method avoids artificial mathematical constructs based on a potential 
response paradigm of statistical causality. While some of our tests are math- 
ematically similar to previously proposed tests based on "principal stratum" 
arguments, our tests differ in that we insist the context variable V be both real 
and relevant. Although V may be wholly or partly unobserved, it is important 
in our method that it be, in principle at least, observable, and that its rela-
tionships with the remaining variables in the problem be explicitly represented in
the causal model. With the aid of study examples, we have shown that such an
exercise is necessary to differentiate situations in which the method is applicable 
from situations in which it is not.

                                               Estimate   Std. Error   z value   p-value
α (intercept)                                     -2.33      0.5         -4.66    3e-06
φ_{G=1} (wild-type rs1333040 homozygous)          -0.06      0.19        -0.31    0.7
φ_{S=0} (no statin treatment)                      1.41      0.24         5.85    4e-09
δ_t (linear effect of calendar year - 1970)       -0.02      0.017       -1.52    0.12
γ_{(S=0)×(G=1)}                                   -1.0       0.33        -3.0     0.002


Table 1: Parameter estimates from a linear-odds regression of the prospective bi-
nary endpoint in our illustrative study (re-infarction within six years from the index
infarction) upon variables S (the statin treatment indicator) and G (a function of
the genotype at SNP rs1333040). Variable G is coded to take value 1 if the individ-
ual carries two copies of the most frequent allele at single nucleotide polymorphism
rs1333040. This table reports estimates for the parameters of the regression model,
as obtained from an analysis of 1200 subjects who were hospitalized on the basis of a
myocardial infarction between 40 and 45 years of age, and were found at that point
to have hypercholesterolemia. These estimates suggest that, in patients with hyper-
cholesterolemia, statins decrease the risk of re-infarction regardless of the rs1333040
genotype (G), although their effect is stronger in patients with G = 0. At highest
risk are those hypercholesterolemic patients with G = 0 who do not receive statins.
Because the quantity γ - α is significantly greater than zero, we deduce that G and S
interfere with each other (and hence strongly coact) to produce re-infarction.



                                    Estimate   Standard Error   z value   p value
Intercept (α)                          0.25        0.013          18.7    < 2e-16
smoker                                 1.46        0.044          33.1    < 2e-16
rare rs4620585 homozygous (RRH)        0.07        0.056           1.2    0.19
smoker x RRH (γ)                       0.9         0.22            4.09   4e-5



Table 2: Parameter estimates from a linear-odds regression of the early MI indica-
tor upon "smoking habit" (obtained by dichotomizing an original "Daily number of
cigarettes" variable) and genotype at SNP rs4620585, based on the set of cases and
controls of our illustrative study.